There are several existing verb lexicons for
general language that include a degree of syntactic information.  FrameNet \citep{framenet} and
VerbNet \citep{verbnet} are general language resources that indicate a
verb's predicate argument structure and the possible semantic roles of its
arguments.  The VALEX \citep{valex} general language lexicon was
produced using an earlier version of the system used in this study,
and is the largest SCF resource available for general language,
containing SCF and frequency information for some 6,400 verbs learned
from up to 10,000 sentences per verb.  PropBank \citep{propbank} is an
extension of the Penn TreeBank \citep{penntreebank} with information
about predicate-argument relationships.

A small number of verb lexicons already exist for biomedicine.  BioFrameNet \citep{dolbey:06} extends FrameNet with links to biomedical resources (e.g.~gene ontologies).  The UMLS SPECIALIST Lexicon \citep{mccray:94} includes verb subcategorization information for some 11,000 verbs, but is manually built from a variety of biomedical and general language dictionaries.  BioProp \citep{tsai:05} adds PropBank-style annotation to 500 abstracts from the GENIA corpus.  PASBio \citep{wattarujeekrit:04} is an inventory of predicate-argument structure frames for 30 verbs, focused on molecular biology.  The frames were constructed through expert examination of MEDLINE sentences, using guidelines similar to those of PropBank. The resource most relevant to this study is the BOOTStrep
BioLexicon \citep{biolexicon}, which produces verb subcategorization
data automatically. This system is described in full in Section~\ref{biolexicon}.

To the best of our knowledge, no previous gold standards are
available that are suitable for evaluating automatically acquired SCF lexicons for biomedical text.  As a general
principle, manually developed resources tend not to be sufficiently
comprehensive in their SCF coverage to serve as gold standards, due to
the rarity of many SCF types, which may be missed during the
introspective process of resource creation.  For example, the majority
of verbs considered in PASBio have just two attested frames, compared
to nine for general language verbs in the gold standard associated with
VALEX, and six in the gold standards we produce here.  Manual resources
also lack the statistical information that is naturally gathered
during automatic production.  The BioLexicon, on the other hand, while produced from a
corpus, is unsuitable for use as a gold standard because its output
has not been manually corrected. Moreover, the filtering used (a
relative frequency cutoff of 0.03), while appropriate for removing noise
from a lexicon, is too aggressive for gold standard creation because many
SCFs are genuinely rare.
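
To make the effect of such a cutoff concrete, the following sketch assumes the usual definition of relative frequency.  The notation is introduced here for illustration only and is not taken from the BioLexicon description: $c(v,s)$ denotes the number of corpus occurrences of verb $v$ analysed with frame $s$, and $\theta$ denotes the threshold.  Under this assumption, and assuming that a frame exactly at the threshold is kept, an SCF $s$ is retained for $v$ only if
\[
  \frac{c(v,s)}{\sum_{s'} c(v,s')} \;\geq\; \theta, \qquad \theta = 0.03 .
\]
For instance, a frame observed twice among 100 analysed occurrences of a verb has relative frequency $0.02 < 0.03$ and would be discarded, even if both observations are correct.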
